Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure

نویسندگان

  • Jonathan Lyle Lustgarten
  • Jeya Balaji Balasubramanian
  • Shyam Visweswaran
  • Vanathi Gopalakrishnan
چکیده

The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their abilities to summarize the observed data. We extend a Bayesian Rule Learning (BRL-GSS) algorithm, previously shown to be a significantly better predictor than other classical approaches in this domain. It searches a space of Bayesian networks using a decision tree representation of its parameters with global constraints, and infers a set of IF-THEN rules. The number of parameters and therefore the number of rules are combinatorial to the number of predictor variables in the model. We relax these global constraints to a more generalizable local structure (BRL-LSS). BRL-LSS entails more parsimonious set of rules because it does not have to generate all combinatorial rules. The search space of local structures is much richer than the space of global structures. We design the BRL-LSS with the same worst-case time-complexity as BRL-GSS while exploring a richer and more complex model space. We measure predictive performance using Area Under the ROC curve (AUC) and Accuracy. We measure model parsimony performance by noting the average number of rules and variables needed to describe the observed data. We evaluate the predictive and parsimony performance of BRL-GSS, BRL-LSS and the state-of-the-art C4.5 decision tree algorithm, across 10-fold cross-validation using ten microarray gene-expression diagnostic datasets. In these experiments, we observe that BRL-LSS is similar to BRL-GSS in terms of predictive performance, while generating a much more parsimonious set of rules to explain the same observed data. BRL-LSS also needs fewer variables than C4.5 to explain the data with similar predictive performance. We also conduct a feasibility study to demonstrate the general applicability of our BRL methods on the newer RNA sequencing gene-expression data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Introduction to Inference and Learning in Bayesian Networks

Bayesian networks (BNs) are modern tools for modeling phenomena in dynamic and static systems and are used in different subjects such as disease diagnosis, weather forecasting, decision making and clustering. A BN is a graphical-probabilistic model which represents causal relations among random variables and consists of a directed acyclic graph and a set of conditional probabilities. Structure...

متن کامل

 Structure Learning in Bayesian Networks Using Asexual Reproduction Optimization

A new structure learning approach for Bayesian networks (BNs) based on asexual reproduction optimization (ARO) is proposed in this letter. ARO can be essentially considered as an evolutionary based algorithm that mathematically models the budding mechanism of asexual reproduction. In ARO, a parent produces a bud through a reproduction operator; thereafter the parent and its bud compete to survi...

متن کامل

INTEGRATED ADAPTIVE FUZZY CLUSTERING (IAFC) NEURAL NETWORKS USING FUZZY LEARNING RULES

The proposed IAFC neural networks have both stability and plasticity because theyuse a control structure similar to that of the ART-1(Adaptive Resonance Theory) neural network.The unsupervised IAFC neural network is the unsupervised neural network which uses the fuzzyleaky learning rule. This fuzzy leaky learning rule controls the updating amounts by fuzzymembership values. The supervised IAFC ...

متن کامل

A Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf

Evaporation phenomena is a effective climate component on water resources management and has special importance in agriculture. In this paper, Bayesian belief networks (BBNs) as a non-linear modeling technique provide an evaporation estimation  method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...

متن کامل

A Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf

Evaporation phenomena is a effective climate component on water resources management and has special importance in agriculture. In this paper, Bayesian belief networks (BBNs) as a non-linear modeling technique provide an evaporation estimation  method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2  شماره 

صفحات  -

تاریخ انتشار 2017